Reordering Constraint Based on Document-Level Context
Abstract
Phrase-based statistical machine translation handles long-distance reordering poorly when the source and target languages have different word orders, as in Japanese-English translation. In this paper, we propose a method of imposing reordering constraints using document-level context. As the document-level context, we use noun phrases that occur significantly in the context documents containing the source sentences. Given a source sentence, zones that cover these noun phrases are used as reordering constraints, and during decoding, reorderings that violate the zones are restricted. Experimental results on patent translation tasks show significant improvements of 1.20% BLEU points in Japanese-English translation and 1.41% BLEU points in English-Japanese translation.
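The zone constraint described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `find_zones` and `violates_zones` are hypothetical, noun phrases are matched by exact token sequence, and a zone is assumed to be violated when the decoder translates a word outside the zone before it has finished all the words inside it. The candidate order is assumed to be a full permutation of source positions.

```python
from typing import List, Sequence, Tuple

# A zone is a half-open span [start, end) over source token positions.
Zone = Tuple[int, int]


def find_zones(tokens: Sequence[str],
               noun_phrases: Sequence[Sequence[str]]) -> List[Zone]:
    """Return the spans where any given noun phrase occurs in the sentence.

    Assumption: phrases are matched by exact token sequence.
    """
    zones = set()
    for phrase in noun_phrases:
        n = len(phrase)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            if list(tokens[i:i + n]) == list(phrase):
                zones.add((i, i + n))
    return sorted(zones)


def violates_zones(order: Sequence[int], zones: Sequence[Zone]) -> bool:
    """Check whether a translation order breaks any zone.

    `order` lists source positions in the order they are translated.
    Assumption: a zone is respected when its words are translated in
    consecutive steps, in any internal order.
    """
    for start, end in zones:
        steps = [k for k, pos in enumerate(order) if start <= pos < end]
        if steps and steps[-1] - steps[0] + 1 != len(steps):
            return True
    return False


if __name__ == "__main__":
    sentence = "the optical disk drive reads data".split()
    zones = find_zones(sentence, [["optical", "disk", "drive"]])
    print(zones)
    # Moving the whole zone as a block is allowed.
    print(violates_zones([4, 5, 0, 1, 2, 3], zones))
    # Interrupting the zone with an outside word is not.
    print(violates_zones([0, 1, 4, 2, 3, 5], zones))
```

In this sketch a decoder would discard or penalize any hypothesis for which `violates_zones` returns `True`; how the restriction is actually applied during decoding is not specified in the abstract.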
Similar resources
Novel Reordering Approaches in Phrase-Based Statistical Machine Translation
This paper presents novel approaches to reordering in phrase-based statistical machine translation. We perform consistent reordering of source sentences in training and estimate a statistical translation model. Using this model, we follow a phrase-based monotonic machine translation approach, for which we develop an efficient and flexible reordering framework that allows to easily introduce dif...
Tree Kernel-based SVM with Structured Syntactic Knowledge for BTG-based Phrase Reordering
Structured syntactic knowledge is important for phrase reordering. This paper proposes using convolution tree kernel over source parse tree to model structured syntactic knowledge for BTG-based phrase reordering in the context of statistical machine translation. Our study reveals that the structured syntactic features over the source phrases are very effective for BTG constraint-based phrase re...
A Topic-Based Reordering Model for Statistical Machine Translation
Reordering models are one of essential components of statistical machine translation. In this paper, we propose a topic-based reordering model to predict orders for neighboring blocks by capturing topic-sensitive reordering patterns. We automatically learn reordering examples from bilingual training data, which are associated with document-level and word-level topic information induced by LDA t...
Enhanced Compressed RTP (CRTP) for Links with High Delay, Packet Loss and Reordering
Status of this Memo: This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protoc...
United States Patent: Reducing Latency in Video Encoding and Decoding
(74) Attorney, Agent, or Firm: Aaron Chatterjee; Andrew Sanders; Micky Minhas (57) ABSTRACT Techniques and tools for reducing latency in video encoding and decoding by constraining latency due to reordering of video frames, and by indicating the constraint on frame reordering latency with one or more syntax elements that accompany encoded data for the video frames. For example, a real time c...